Abstract

In this report, we explore the use of a quantum optimization algorithm for obtaining
low-energy conformations of protein models. We discuss mappings between protein models and
optimization variables, which are in turn mapped to a system of coupled quantum bits. General
strategies are given for constructing Hamiltonians to be used to solve optimization problems of
physical, chemical, or biological interest via quantum computation by adiabatic evolution. As an
example, we implement the Hamiltonian corresponding to the Hydrophobic-Polar (HP) model for
protein folding. Furthermore, we present an approach to reduce the resulting Hamiltonian to
two-body terms, with a view toward an experimental realization.



PACS numbers: 87.15.Cc, 03.67.Ac, 05.50.+q, 75.10.Nr



I. INTRODUCTION 



Finding the ensemble of low-energy conformations of a peptide given its primary sequence
is a fundamental problem of computational biology, commonly known as the protein folding
problem [1, 2, 3, 4, 5, 6, 7]. The native fold conformation is usually assumed to correspond to
the global minimum of the protein's free energy (according to the so-called thermodynamic
hypothesis [8]), although some exceptions have been proposed [9, 10]. Thus, the protein
folding problem can be described as a global optimization problem. Algorithms for quantum
computers have been developed for many applications such as factoring [11] and the
calculation of molecular energies [12]. In this report, we investigate the approach of using
an adiabatic quantum computer for folding a highly simplified protein model.

The HP (H: hydrophobic, P: polar) lattice model [13] is one of the simplest protein models
implemented. Still, its accuracy in predicting some of the folding behaviour of real proteins
has made it a useful benchmark for testing optimization algorithms such as simulated
annealing [14], genetic algorithms [15, 16, 17, 18, 19], and ant colony optimization [20]. Other
heuristic methods such as hydrophobic core threading [21], chain growth [22, 23], contact
interactions [24], and hydrophobic zippers [25] have also been considered. The HP model
has also been useful for a qualitative investigation of the nature of the folding process and
the interactions between proteins. The HP model depicted in Fig. 1 is defined by three
assumptions: 1) there are only two kinds of amino acids or residues, hydrophobic (H) and
polar (P); 2) residues are placed on a grid (typically a square grid for the 2D model and
a cubic grid for the 3D model); 3) the only interaction among amino acids is the favorable
contact between two H residues that are not adjacent in the sequence. The energy of this
interaction is defined as $-1$ in arbitrary units, representing a hydrophobic effect which tends
to fold the protein in a way that aggregates the H residues in a predominantly hydrophobic
core, and leaves the P residues at the surface of the protein. The search for the native
conformation of the protein is represented by a self-avoiding walk on the grid.
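
The three assumptions above can be made concrete with a short sketch. The function name `hp_energy` and the coordinate convention (a list of `(x, y)` lattice sites, one per residue) are illustrative assumptions and not part of the original text; the code only evaluates the classical HP energy of a given self-avoiding walk on a square lattice.

```python
from typing import List, Tuple


def hp_energy(sequence: str, coords: List[Tuple[int, int]]) -> int:
    """Classical HP energy: -1 for every H-H lattice contact between
    residues that are not adjacent in the sequence (hypothetical helper)."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive residues must occupy neighbouring lattice sites.
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("chain is broken")
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip sequence neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    energy -= 1
    return energy


# HPPH folded into a unit square: residues 1 and 4 are in contact.
folded = hp_energy("HPPH", [(1, 2), (1, 1), (2, 1), (2, 2)])
# The same sequence stretched along a line has no H-H contact.
stretched = hp_energy("HPPH", [(0, 1), (1, 1), (2, 1), (3, 1)])
print(folded, stretched)
```

Enumerating all self-avoiding walks with such a function is feasible only for short chains, which is the exponential growth discussed next.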

An important property of the model is that the number of possible conformations is
roughly proportional to $2.7^N$ [13], where $N$ is the length of the polypeptide chain. Proofs
of the NP-completeness of both the 2D and 3D HP models have been given [26, 27]. Due
to this exponential growth, global optimality proofs become impractical when $N$ reaches
approximately 50 residues. For longer sequences, heuristics and stochastic algorithms have
been employed for $N$ up to 136 for the 3D HP model [21].

FIG. 1: (Color online) The lattice protein hydrophobic-polar (HP) model, showing the global energy
minimum conformation for a sequence of 24 amino acids, HHPHPPPHHHHPPHHHHPPPHPHH
($E = -12$). Blue (dark grey) beads represent hydrophobic residues (H) and orange (light grey)
beads represent polar residues (P). The model consists of a self-avoiding chain with favorable
($E = -1$) energetic interactions among hydrophobic residues in contact. Contacts between nearest
neighbors in the primary sequence are unavoidable, and their contribution is not added to the
calculated energy. Black dots represent lattice sites. Dotted lines represent favorable energetic
interactions; solid lines represent the self-avoiding chain.

This report is structured as follows. Sec. II presents the general quantum algorithm and
the terms of the Hamiltonian necessary to obtain the folded structure of the protein, and
describes how to map the problem to arrays of coupled quantum bits [28, 29]. Sec. III
explains the construction of the core component of the algorithm, the Hamiltonian that
encodes the lowest energy conformation of the protein. In Sec. IV we solve in detail the four
amino acid sequence HPPH in a two-dimensional grid. In Sections V and VI we discuss
the resources necessary to carry out the reduction from a general $k$-body Hamiltonian to a
two-body Hamiltonian, as a function of the size of the protein.



II. AN ADIABATIC QUANTUM ALGORITHM FOR THE HP MODEL 

We begin this section by describing the mapping of a sequence of amino acids into 
binary variables, which will in turn be mapped to spin variables in the quantum mechanical 
version of the algorithm. 
A. Mapping amino acids onto a lattice

The mapping of the coordinates of a sequence of amino acids to a given grid of size
$N \times N$ is developed as follows. We assume, without loss of generality, that the number of
amino acids, $N$, is a power of 2. A binary representation for the labels of the grid requires
$\log_2 N$ binary variables to specify the position of an amino acid in each dimension, as shown
in Fig. 2. The position of each of $N$ amino acids in a $D$-dimensional lattice may thus be encoded
by a bit string $q$ composed of exactly $DN\log_2 N$ binary variables $q_i$. For example, for $N = 4$,
$D = 2$, the length of the bit string $q$ is 16 and therefore the number of configurations that
can be explored is $2^{16}$. Let $q$ denote a particular configuration of the protein in the grid,
written in the form

$$ q = \underbrace{q_{16}q_{15}}_{y_4}\,\underbrace{q_{14}q_{13}}_{x_4}\,\underbrace{q_{12}q_{11}}_{y_3}\,\underbrace{q_{10}q_{9}}_{x_3}\,\underbrace{q_{8}q_{7}}_{y_2}\,\underbrace{q_{6}q_{5}}_{x_2}\,\underbrace{q_{4}q_{3}}_{y_1}\,\underbrace{q_{2}q_{1}}_{x_1} \tag{1} $$

where $x_i$ and $y_i$ are the $x$ and $y$ coordinates of the $i$-th amino acid. Fig. 2 shows an example
of the coordinate mapping given a specific sequence of residues or amino acids.
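
As a hedged illustration of the bit-string convention of Eq. 1, the sketch below decodes a bit string into lattice coordinates. The function name `decode` is an assumption introduced here; the bit layout (rightmost character is $q_1$, and each coordinate occupies $\log_2 N$ consecutive bits with the least significant bit first) follows Eq. 1 as reconstructed above.

```python
from typing import List, Tuple


def decode(bitstring: str, n_amino: int, dim: int) -> List[Tuple[int, ...]]:
    """Decode q = q_n ... q_2 q_1 (written left to right) into one
    coordinate tuple (x, y, ...) per amino acid (hypothetical helper)."""
    b = n_amino.bit_length() - 1            # log2(N); N is a power of two
    s = bitstring.replace(" ", "")
    if len(s) != n_amino * dim * b:
        raise ValueError("unexpected bit-string length")
    # q[1] is the rightmost character of the string.
    q = {idx + 1: int(c) for idx, c in enumerate(reversed(s))}
    coords = []
    for i in range(1, n_amino + 1):
        point = []
        for k in range(1, dim + 1):
            offset = dim * (i - 1) * b + (k - 1) * b
            point.append(sum(q[offset + r] << (r - 1) for r in range(1, b + 1)))
        coords.append(tuple(point))
    return coords


# Bit string quoted in the caption of Fig. 2 for the invalid configuration (b).
print(decode("1100 0110 0101 1011", n_amino=4, dim=2))
```

With this convention, amino acids 2 and 3 decode to the central sites (1, 1) and (2, 1), while amino acid 1 lands on a site that is not adjacent to amino acid 2, which is why the configuration is invalid.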

In the quantum version of the problem, these configurations span a Hilbert space of
dimension $2^{16}$. The state vectors can be written as

$$ |q\rangle = |q_{16}\rangle\,|q_{15}\rangle \cdots |q_{2}\rangle\,|q_{1}\rangle . \tag{2} $$

We wish to implement a Hamiltonian which encodes the ground state of the protein on a
spin-1/2 quantum computer [30] or, in particular, onto an Ising-like Hamiltonian with a
transverse magnetic field [31] (see Sec. II B). To do so, we realize the 16-qubit Hilbert space
as a system of 16 spin-1/2 particles, with $|q_i = 0\rangle$ mapped to the spin state
$|\sigma_i^z = +1\rangle$ and $|q_i = 1\rangle$ mapped to $|\sigma_i^z = -1\rangle$, with these spin states as
the computational basis. In other words, the quantum version of the configuration states is
related to spin variables through the transformation

$$ \hat{q} \equiv \tfrac{1}{2}\,(I - \hat{\sigma}^z), \tag{3} $$

with $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\hat{\sigma}^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, the identity operator and the Pauli matrix represented
in the computational basis, respectively.
FIG. 2: (Color online) Grid-labeling conventions for a sequence of 4 amino acids, HPPH. (a)
Amino acids 2 and 3 are fixed in the center of the grid to eliminate translational degeneracy. (b)
One of the possible invalid configurations that might arise in the search and that would need to be
discarded by the optimization algorithm. (c) Lowest-energy conformation for this example. The
dotted line between amino acids 1 and 4 represents the hydrophobic interaction favored by the
HP model. The configurations to optimize assume the form
$q = q_{16}q_{15}q_{14}q_{13}\;0110\;0101\;q_{4}q_{3}q_{2}q_{1}$, where the sets of variables
$q_{16}q_{15}q_{14}q_{13}$ and $q_{4}q_{3}q_{2}q_{1}$ determine the positions of amino acids 4 and 1,
respectively. For the particular case in (b), $q = 1100\;0110\;0101\;1011$.



In Sec. III we will derive an energy function in terms of the $ND\log_2 N$ binary variables
used to describe all of the possible configurations for the amino acids in a $D$-dimensional
lattice. This energy function is constructed so that its minimum will yield the lowest-energy
conformations of the protein. Eq. 3 provides the rule for the mapping of this energy function
to a quantum Hamiltonian. Each $q_i$ in the energy function will be replaced by an operator
$\hat{q}_i$. The operator $\hat{q}_i$ is to be understood as a shorthand notation for a quantum operator
acting on the $i$-th qubit of the $ND\log_2 N$ multipartite Hilbert space,
$\mathcal{H}_{ND\log_2 N} \otimes \mathcal{H}_{ND\log_2 N - 1} \otimes \cdots \otimes \mathcal{H}_{i} \otimes \cdots \otimes \mathcal{H}_{1}$.
The explicit form of $\hat{q}_i$ is given by $I \otimes I \cdots \otimes \hat{q} \otimes \cdots \otimes I$. Notice
that the operator $\hat{q}$ as defined in Eq. 3 has been placed in the $i$-th position, and the identity
operator acts on the rest of the Hilbert space. Products of the form $q_i q_j$ will be replaced by
a quantum operator $\hat{q}_i\hat{q}_j$, which is a shorthand notation for the operators $\hat{q}_i$ and
$\hat{q}_j$ acting on the $i$-th and the $j$-th qubits, respectively. As an illustrative example,
consider an energy function dependent on four binary variables,

$$ E(q_1, q_2, q_3, q_4) = 1 - q_1 q_2 + q_1 q_3 + q_2 q_3 q_4 , $$

which will be mapped to a Hamiltonian acting on a four-qubit Hilbert space,
$\mathcal{H}_4 \otimes \mathcal{H}_3 \otimes \mathcal{H}_2 \otimes \mathcal{H}_1$. In the instance of this particular energy
function the Hamiltonian will assume the form

$$ \hat{H} = I \otimes I \otimes I \otimes I - I \otimes I \otimes \hat{q} \otimes \hat{q} + I \otimes \hat{q} \otimes I \otimes \hat{q} + \hat{q} \otimes \hat{q} \otimes \hat{q} \otimes I $$
$$ \phantom{\hat{H}} = I - \hat{q}_1\hat{q}_2 + \hat{q}_1\hat{q}_3 + \hat{q}_2\hat{q}_3\hat{q}_4 . \tag{4} $$

Following this mapping, transformation of any energy function to the quantum Hamiltonian
is straightforward.
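
The mapping of the four-variable example can be checked numerically. The following is a minimal sketch using NumPy Kronecker products; the helper name `op` and the basis-index convention (index $= 8q_4 + 4q_3 + 2q_2 + q_1$) are assumptions made for this illustration.

```python
import itertools

import numpy as np

I2 = np.eye(2)
Q = np.array([[0.0, 0.0], [0.0, 1.0]])  # q = (I - sigma_z) / 2 in the computational basis


def op(n, *targets):
    """Tensor product over n qubits with Q on the listed qubits and the
    identity elsewhere; qubit 1 is the rightmost tensor factor."""
    out = np.array([[1.0]])
    for site in range(n, 0, -1):       # leftmost factor is qubit n
        out = np.kron(out, Q if site in targets else I2)
    return out


def energy(q1, q2, q3, q4):
    return 1 - q1 * q2 + q1 * q3 + q2 * q3 * q4


n = 4
H = op(n) - op(n, 1, 2) + op(n, 1, 3) + op(n, 2, 3, 4)

# H is diagonal, and each diagonal entry equals the classical energy.
for q4, q3, q2, q1 in itertools.product((0, 1), repeat=n):
    index = 8 * q4 + 4 * q3 + 2 * q2 + q1
    assert H[index, index] == energy(q1, q2, q3, q4)
```

Because every $\hat{q}_i$ is diagonal in the computational basis, the resulting operator is diagonal and its spectrum is exactly the set of values taken by the energy function.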

In order to eliminate redundancy due to translational symmetry, we fixed the two middle
amino acids in a central position (see Fig. 2). This reduces the number of binary variables
in the bit string from sixteen to eight. The variables corresponding to amino acids 1 and
4, $q_4 q_3 q_2 q_1$ and $q_{16} q_{15} q_{14} q_{13}$, respectively, become the variables of interest, and the
variables $q_8 q_7 q_6 q_5$ and $q_{12} q_{11} q_{10} q_9$, corresponding to amino acids 2 and 3, become constant
throughout the optimization process. In general, the $(N/2)$-th amino acid is assigned to the
$(N/2)$-th grid point in all $D$ dimensions. The $(N/2 + 1)$-th amino acid is fixed to the
$(N/2 + 1)$-th grid point in the $x$ direction and to the $(N/2)$-th grid point in all other $D - 1$
dimensions. As shown in Fig. 2, the final configuration we will try to optimize for the case of
four amino acids takes the form $|q\rangle = |q_{16} q_{15} q_{14} q_{13}\rangle\,|0110\rangle\,|0101\rangle\,|q_4 q_3 q_2 q_1\rangle$.

B. Adiabatic Quantum Computation 

The goal of an adiabatic quantum algorithm is to transform an initial state into a final
state which encodes the answer to the problem. A quantum state $|\psi(t)\rangle$ in the
$2^n$-dimensional Hilbert space for $n$ qubits evolves in time according to the Schrödinger equation

$$ i\hbar\,\frac{d}{dt}\,|\psi(t)\rangle = H(t)\,|\psi(t)\rangle , \tag{5} $$

where $H(t)$ is the time-dependent Hamiltonian operator. The design of the algorithm takes
advantage of the quantum adiabatic theorem [32], which applies whenever $H(t)$ varies
slowly throughout the time of propagation $t \in [0, \tau]$. Let $|\psi_g(t)\rangle$ be the instantaneous
ground state of $H(t)$. If we construct $H(t)$ such that the ground state of $H(0)$, denoted as
$|\psi_g(0)\rangle$, is easy to prepare, the adiabatic theorem states that the time propagation of the
quantum state will remain very close to $|\psi_g(t)\rangle$ for all $t \in [0, \tau]$. One way to choose $H(0)$
is to construct it in such a way that $|\psi_g(0)\rangle$ is a uniform superposition of all possible
configurations of the system, i.e.,

$$ |\psi_g(0)\rangle = \frac{1}{\sqrt{2^n}} \sum_{q_i \in \{0,1\}} |q_n\rangle \cdots |q_2\rangle\,|q_1\rangle , \tag{6} $$

summing over all $2^n$ vectors of the computational basis. Notice that an initial Hamiltonian
of the form

$$ H(0) = \sum_{i=1}^{n} \hat{q}_i^{x} = \sum_{i=1}^{n} \tfrac{1}{2}\,(I - \hat{\sigma}_i^{x}) \tag{7} $$

would have as a non-degenerate ground state the vector $|\psi_g(0)\rangle$ defined in Eq. 6.
Similarly to the operator $\hat{q}$ from Eq. 3, we define

$$ \hat{q}^{x} \equiv \tfrac{1}{2}\,(I - \hat{\sigma}^{x}), \tag{8} $$

with $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $\hat{\sigma}^{x} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, the identity operator and the $\sigma^x$ Pauli matrix represented
in the computational basis, respectively.

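For example, for the case of four qubits, $n = 4$, $H(0)$ is given by

$$ H(0) = \sum_{i=1}^{4} \hat{q}_i^{x} = \hat{q}_1^{x} + \hat{q}_2^{x} + \hat{q}_3^{x} + \hat{q}_4^{x} \tag{9} $$
$$ \phantom{H(0)} = I \otimes I \otimes I \otimes \hat{q}^{x} + I \otimes I \otimes \hat{q}^{x} \otimes I + I \otimes \hat{q}^{x} \otimes I \otimes I + \hat{q}^{x} \otimes I \otimes I \otimes I . \tag{10} $$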
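
The claim that the uniform superposition is the non-degenerate ground state of this initial Hamiltonian can be verified for $n = 4$ with a small NumPy sketch. The function name `initial_hamiltonian` is introduced here for illustration only.

```python
import numpy as np

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
QX = 0.5 * (I2 - SX)                      # q^x = (I - sigma_x) / 2


def initial_hamiltonian(n):
    """Sum over qubits of q^x acting on one qubit and the identity elsewhere."""
    dim = 2 ** n
    h0 = np.zeros((dim, dim))
    for site in range(n):
        term = np.array([[1.0]])
        for pos in range(n):
            term = np.kron(term, QX if pos == site else I2)
        h0 += term
    return h0


H0 = initial_hamiltonian(4)
evals, evecs = np.linalg.eigh(H0)         # eigenvalues in ascending order
ground = evecs[:, 0]

print(evals[:2])                          # lowest two eigenvalues
print(np.abs(ground))                     # amplitudes of the ground state
```

Each single-qubit $\hat{q}^x$ has eigenvalues 0 and 1, so the spectrum of $H(0)$ counts how many qubits are in the $\sigma^x = -1$ state, and the gap above the ground state is 1.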

To find the lowest energy conformation of the protein, one defines a Hamiltonian, $H_{protein}$,
whose ground state encodes the solution. Adiabatic evolution begins with $H(0)$ and $|\psi_g(0)\rangle$,
and ends in $H_{protein} = H(\tau)$. If the adiabatic evolution is slow enough, the state obtained
at time $t = \tau$ is $|\psi_g(\tau)\rangle$, the ground state of $H(\tau) = H_{protein}$. The details about the
construction of $H_{protein}$ will be provided in Sec. III. A possible adiabatic evolution path can
be constructed by the linear sweep of a parameter $t \in [0, \tau]$,

$$ H(t) = (1 - t/\tau)\,H(0) + (t/\tau)\,H_{protein} . \tag{11} $$

Even though Eq. 11 connects $H(0)$ and $H_{protein}$, determining the optimum value of $\tau$ is an
important and non-trivial problem in itself. In principle, the adiabatic theorem states that,
for a sufficiently long evolution time $\tau$, the state will converge to the solution to the problem,
$|\psi_g(\tau)\rangle$. The magnitude of $\tau$ dictates the ultimate usefulness of the quantum algorithm
proposed in this work. Farhi et al. [33, 34] showed promising numerical results for random
instances of the Exact Cover computational problem.

Notice that the parameter $\tau$ determines the rate at which $H(t)$ varies. Following the
notation from Farhi et al. [33], consider $H(t) = \tilde{H}(t/\tau) = \tilde{H}(s)$, with instantaneous
eigenvalues and eigenstates of $\tilde{H}(s)$ defined by

$$ \tilde{H}(s)\,|l; s\rangle = E_l(s)\,|l; s\rangle \tag{12} $$

with

$$ E_0(s) \le E_1(s) \le \cdots \le E_{N-1}(s) \tag{13} $$

where $N$ is the dimension of the Hilbert space. According to the adiabatic theorem, if the
gap between the two lowest levels, $E_1(s) - E_0(s)$, is greater than zero for all $0 \le s \le 1$, and
taking

$$ \tau \gg \frac{\varepsilon}{g_{\min}^{2}} \tag{14} $$

with the minimum gap, $g_{\min}$, defined by

$$ g_{\min} = \min_{0 \le s \le 1}\,\bigl(E_1(s) - E_0(s)\bigr), \tag{15} $$

and $\varepsilon$ given by

$$ \varepsilon = \max_{0 \le s \le 1}\,\Bigl|\Bigl\langle l = 1; s \Bigr|\,\frac{d\tilde{H}}{ds}\,\Bigl| l = 0; s \Bigr\rangle\Bigr| , \tag{16} $$

then we can make

$$ \bigl|\langle l = 0; s = 1\,|\,\psi(\tau)\rangle\bigr| \tag{17} $$

arbitrarily close to 1. In other words, the existence of a nonzero gap guarantees that $|\psi(t)\rangle$
remains very close to the ground state of $H(t)$ for all $0 \le t \le \tau$, if $\tau$ is sufficiently large.
In the following sections, we derive the expression for an energy function which is mapped
to $H_{protein}$ using the procedure explained in Sec. II A. The final expression for $H_{protein}$
corresponds to an array of coupled qubits. We use $H$ to denote both the Hamiltonians and
the energy functions, given that the mapping is straightforward, as explained at the end of
Sec. II A.
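
The quantities entering the adiabatic condition can be estimated numerically for very small systems. The sketch below is an illustration under stated assumptions: it uses a hypothetical three-variable energy function `toy_energy` with a unique minimum (not one of the protein Hamiltonians of this report), the linear sweep between the initial Hamiltonian and the problem Hamiltonian, and a uniform grid in $s = t/\tau$. At points where the first excited level is degenerate, the excited eigenvector returned by `numpy.linalg.eigh` is one arbitrary choice within the degenerate subspace, so the value of `eps` is only indicative.

```python
import numpy as np

I2 = np.eye(2)
QX = 0.5 * (I2 - np.array([[0.0, 1.0], [1.0, 0.0]]))


def initial_hamiltonian(n):
    """H(0): sum of q^x operators, one per qubit."""
    h0 = np.zeros((2 ** n, 2 ** n))
    for site in range(n):
        term = np.array([[1.0]])
        for pos in range(n):
            term = np.kron(term, QX if pos == site else I2)
        h0 += term
    return h0


def problem_hamiltonian(energy, n):
    """Diagonal Hamiltonian; bit r of the basis index is q_{r+1}."""
    values = [energy(*[(idx >> r) & 1 for r in range(n)]) for idx in range(2 ** n)]
    return np.diag(np.array(values, dtype=float))


def toy_energy(q1, q2, q3):
    # Hypothetical example with a unique minimum at (q1, q2, q3) = (1, 0, 0).
    return 1 - q1 + q2 + q1 * q3


n = 3
H0 = initial_hamiltonian(n)
HP = problem_hamiltonian(toy_energy, n)
dH = HP - H0                              # dH/ds for the linear sweep

g_min, eps = np.inf, 0.0
for s in np.linspace(0.0, 1.0, 201):
    evals, evecs = np.linalg.eigh((1 - s) * H0 + s * HP)
    g_min = min(g_min, evals[1] - evals[0])
    eps = max(eps, abs(evecs[:, 1] @ dH @ evecs[:, 0]))

tau_scale = eps / g_min ** 2              # tau should be much larger than this
print(g_min, eps, tau_scale)
```

For realistic problem sizes the matrices grow as $2^n$ and this direct diagonalization is no longer possible, which is why the scaling of the minimum gap is the central open question for adiabatic algorithms.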
III. CONSTRUCTION OF THE LATTICE PROTEIN HAMILTONIAN FOR ADIABATIC
QUANTUM COMPUTATION

Our goal in this section is to find an algebraic expression for an energy function in
which the ground state represents the lowest energy conformation of a protein. Ideally, this
energy function should contain the least possible number of terms. In order to optimize
the computational resources, we desire terms with low locality, where the locality of a term is
the number of $q_i$ factors that appear in it (e.g., a term of the form $h = q_1 q_2 q_3 q_4$ is
4-local).



A. Small computer science digression 



Encoding positions of the amino acids in the grid in terms of Boolean variables makes it
very convenient to use tools from computer science and basic Boolean algebra [35]. In this
section, we will review these tools before using them to construct arbitrary Hamiltonians that
encode the spectrum of statistical mechanical models. We begin with some simple relations
that are useful in the derivation of the Hamiltonian terms.

Consider two Boolean variables $x$ and $y$. Expressions for the operations AND, OR, and NOT
can be written algebraically as:

$f_{AND}(x, y) = xy$              AND operation ($x \wedge y$)

$f_{OR}(x, y) = x + y - xy$       OR operation ($x \vee y$)

$f_{NOT}(x) = 1 - x$              NOT operation ($\neg x$)

An additional useful Boolean operator for the construction of Hamiltonian terms is XNOR.
The output of the XNOR function is 0 unless all its arguments have the same value. The
two-input version of the XNOR operation is also known as logical equality, here denoted as EQ,

$f_{EQ}(x, y) = 1 - x - y + 2xy$  XNOR operation ($x$ EQ $y$)

The XNOR operator can be used to construct a very useful term for statistical mechanics
Hamiltonians, an onsite repulsion penalty (described in Sec. III B and illustrated in Fig. 3).
FIG. 3: (Color online) Illustrative example of one of the uses of the XNOR Boolean function in our
scheme for the construction of Hamiltonians. Consider two particles 1 and 2 that are restricted
to occupy either position 0 or 1 in the dimension shown, and let $x_1$ and $x_2$ encode the position of
particle 1 and particle 2, respectively. The Boolean function $f_{EQ}$ can be interpreted as an onsite
repulsion Hamiltonian which penalizes configurations where $x_1 = x_2$. The possible configurations
are encoded in the bit string $x = x_1 x_2$.
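
The algebraic forms of the Boolean operations can be checked against their truth tables. The following sketch uses the hypothetical function names `f_and`, `f_or`, `f_not`, and `f_eq` for the four polynomials listed above.

```python
from itertools import product


def f_and(x, y):
    return x * y


def f_or(x, y):
    return x + y - x * y


def f_not(x):
    return 1 - x


def f_eq(x, y):
    """XNOR (logical equality): 1 when x == y, 0 otherwise."""
    return 1 - x - y + 2 * x * y


# Compare each polynomial with the corresponding Python Boolean operation.
for x, y in product((0, 1), repeat=2):
    assert f_and(x, y) == int(bool(x) and bool(y))
    assert f_or(x, y) == int(bool(x) or bool(y))
    assert f_eq(x, y) == int(x == y)
assert f_not(0) == 1 and f_not(1) == 0
```

In the two-particle picture of Fig. 3, `f_eq(x1, x2)` is the energy penalty: it equals 1 for the bit strings 00 and 11, where both particles share a site, and 0 for 01 and 10.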

B. Hamiltonian terms for protein folding: the HP model 

Most of the configurations represented by the bit strings $q$ of Eq. 1 are invalid protein
states. We seek a Hamiltonian that energetically favors valid configurations of the HP
model by eliminating configurations in which more than one amino acid occupies the same
grid point, and discarding configurations that violate the primary sequence of amino acids.
This Hamiltonian can be written as

$$ H_{protein} = H_{onsite} + H_{psc} + H_{pairwise} , \tag{18} $$

where $H_{onsite}$ is an onsite repulsion term for amino acids occupying the same grid point,
$H_{psc}$ is a primary sequence constraint term, and $H_{pairwise}$ is a pairwise interaction term that
represents favorable hydrophobic interactions between adjacent hydrophobic amino acids.

Each protein configuration can be described by a string of $ND\log_2 N$ bits, where $D$ is
the number of dimensions and $N$ is the number of amino acids. Without loss of generality,
$N$ is here constrained to be a power of two. Below, we describe each term in Eq. 18.
1. Onsite term, $H_{onsite}$

The first term in Eq. 18, $H_{onsite}$, prevents two or more amino acids from occupying the
same grid point. For a given protein, at least one position variable must differ between
each pair of amino acids for $H_{onsite}$ to evaluate to zero. As an illustrative example, a simple
one-dimensional two-site Hamiltonian is shown in Fig. 3 using the XNOR operation described
in Sec. III A.

The general term for $D$ dimensions and $N$ amino acids is

$$ H_{onsite}(N, D) = \lambda_0 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} H_{onsite}^{ij}(N, D) \tag{19} $$

with

$$ H_{onsite}^{ij}(N, D) = \prod_{k=1}^{D} \prod_{r=1}^{\log_2 N} \bigl(1 - q_{f(i,k)+r} - q_{f(j,k)+r} + 2\,q_{f(i,k)+r}\,q_{f(j,k)+r}\bigr) \tag{20} $$

and

$$ f(i, k) = D(i - 1)\log_2 N + (k - 1)\log_2 N . \tag{21} $$

The terms enclosed by the parentheses in Eq. 20 are XNOR functions. The double product
combines these conditions through AND relations, so that all of them are tested
simultaneously. If all the binary variables describing the coordinates of the $i$-th and $j$-th amino
acids are equal, then the series of products of XNOR functions evaluates to $+1$. In this
case, the energy penalty $\lambda_0$, with $\lambda_0 > 0$, is enforced. There will be no energy penalty,
however, if even one of the binary variables for the $i$-th and $j$-th amino acids is different.

The function $f(i, k)$ is a pointer to the bit substring describing the coordinates of a
particular amino acid. The index $i$ points to the $i$-th amino acid and the index $k$ selects
the $k$-th spatial coordinate, whose first bit variable is $q_{f(i,k)+1}$. Here, $k = 1$ corresponds to
the $x$ coordinate, $k = 2$ to the $y$ coordinate, and $k = 3$ to the $z$ coordinate. For example, consider
the case with $N = 4$ and $D = 2$. If we are interested in referring to the first binary variable
describing the $y$ coordinate ($k = 2$) for the third amino acid ($i = 3$), a direct substitution in
Eq. 21 yields $f(3, 2) = 10$, so that the variable of interest is $q_{f(3,2)+1} = q_{11}$, in agreement
with the convention established in Eq. 1.
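
A minimal sketch of Eqs. 19 to 21 is given below. The helper names (`pointer`, `encode`, `onsite_pair`, `h_onsite`) and the representation of the bit string as a 1-indexed Python list are assumptions made for this illustration; the formulas themselves follow the equations above.

```python
from itertools import combinations


def pointer(i, k, n_amino, dim):
    """Eq. 21: offset of the bits encoding coordinate k of amino acid i."""
    b = n_amino.bit_length() - 1          # log2(N); N is a power of two
    return dim * (i - 1) * b + (k - 1) * b


def xnor(a, b):
    return 1 - a - b + 2 * a * b


def encode(coords, n_amino, dim):
    """1-indexed list q (q[0] unused) holding the bits of all coordinates."""
    b = n_amino.bit_length() - 1
    q = [0] * (n_amino * dim * b + 1)
    for i, point in enumerate(coords, start=1):
        for k, value in enumerate(point, start=1):
            for r in range(1, b + 1):
                q[pointer(i, k, n_amino, dim) + r] = (value >> (r - 1)) & 1
    return q


def onsite_pair(q, i, j, n_amino, dim):
    """Eq. 20: 1 if amino acids i and j share a grid point, else 0."""
    b = n_amino.bit_length() - 1
    value = 1
    for k in range(1, dim + 1):
        for r in range(1, b + 1):
            value *= xnor(q[pointer(i, k, n_amino, dim) + r],
                          q[pointer(j, k, n_amino, dim) + r])
    return value


def h_onsite(q, n_amino, dim, lam0):
    """Eq. 19: penalty lam0 for every overlapping pair of amino acids."""
    pairs = combinations(range(1, n_amino + 1), 2)
    return lam0 * sum(onsite_pair(q, i, j, n_amino, dim) for i, j in pairs)


overlapping = encode([(1, 1), (1, 1), (2, 1), (2, 2)], 4, 2)
folded = encode([(1, 2), (1, 1), (2, 1), (2, 2)], 4, 2)
print(pointer(3, 2, 4, 2), h_onsite(overlapping, 4, 2, 5), h_onsite(folded, 4, 2, 5))
```

The value $\lambda_0 = 5$ used in the usage lines corresponds to the choice $\lambda_0 = N + 1$ discussed for the primary structure constraint.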
2. Primary structure constraint, $H_{psc}$

The term $H_{psc}$ in Eq. 18 evaluates to zero only when every two amino acids $P$ and $Q$ that
are consecutive in the sequence are nearest neighbors on the lattice. Nearest neighbors are
defined as those points with a rectilinear ($L_1$) distance of $d_{PQ} = 1$ between them. We define
a distance function that gives the squared distance, expressed as a decimal (base 10) number,
between any two amino acids $P$ and $Q$ on the lattice,

$$ d_{PQ}^{2}(N, D) = \sum_{k=1}^{D} \Bigl(\sum_{r=1}^{\log_2 N} 2^{\,r-1}\,\bigl(q_{f(P,k)+r} - q_{f(Q,k)+r}\bigr)\Bigr)^{2} \tag{22} $$

with $f(i, k)$ defined as in Eq. 21.

A simple way of defining $H_{psc}$ is

$$ H'_{psc}(N, D) = \lambda_1 \sum_{m=1}^{N-1} \bigl(1 - d_{m,m+1}^{2}\bigr)^{2} . \tag{23} $$

Or, preferably,

$$ H_{psc}(N, D) = \lambda_1 \Bigl[-(N - 1) + \sum_{m=1}^{N-1} d_{m,m+1}^{2}\Bigr] . \tag{24} $$

Unlike Eq. 23, the improved Hamiltonian in Eq. 24 is always 2-local regardless of the number
of amino acids or the dimensionality of the problem, since $d_{PQ}^{2}(N, D)$ is always 2-local.

First, notice that for valid configurations, all $(N - 1)$ terms in the sum will equal one, and
$H_{psc}(N, D)$ evaluates to zero. If any of the $d_{m,m+1}^{2}$ terms is zero, meaning that two amino
acids occupy the same location, then $H_{onsite}$ will be drastically raised by the energy penalty
$\lambda_0$. For this penalty to outweigh the decrease of $\lambda_1$ that a vanishing $d_{m,m+1}^{2}$ produces in
$H_{psc}$, we set $\lambda_0 > \lambda_1$, and $\lambda_1 = N$. After excluding configurations
in which any $d_{m,m+1}^{2}$ are zero, only configurations with values of $d_{m,m+1}^{2} \ge 1$ are left. In
these instances, $H_{psc}(N, D) \ge 0$ and $\lambda_1$ will play the role of an energy penalty since $\lambda_1 > 0$.
Choosing $\lambda_1 = N$ and $\lambda_0 = N + 1 > \lambda_1$ constrains unwanted or penalized configurations to
eigenstates of $H_{protein}$ with energies greater than zero, while plausible configurations of the
protein correspond to energies less than or equal to zero. Note that the minimum energy
of the HP model, in the case of all hydrophobic sequences with the maximum number of
favorable contacts, is always greater than $-N$. This is satisfied in general for $N$ amino acids
in either two or three dimensions.
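
The distance function of Eq. 22 and the 2-local constraint of Eq. 24 can be sketched as follows. The helper names (`pointer`, `encode`, `dist2`, `h_psc`) are assumptions introduced for illustration, and the bit string is stored as a 1-indexed Python list.

```python
def pointer(i, k, n_amino, dim):
    """Eq. 21: offset of the bits encoding coordinate k of amino acid i."""
    b = n_amino.bit_length() - 1
    return dim * (i - 1) * b + (k - 1) * b


def encode(coords, n_amino, dim):
    """1-indexed list q (q[0] unused) holding the bits of all coordinates."""
    b = n_amino.bit_length() - 1
    q = [0] * (n_amino * dim * b + 1)
    for i, point in enumerate(coords, start=1):
        for k, value in enumerate(point, start=1):
            for r in range(1, b + 1):
                q[pointer(i, k, n_amino, dim) + r] = (value >> (r - 1)) & 1
    return q


def dist2(q, p, s, n_amino, dim):
    """Eq. 22: squared distance between amino acids p and s."""
    b = n_amino.bit_length() - 1
    total = 0
    for k in range(1, dim + 1):
        diff = sum(2 ** (r - 1) * (q[pointer(p, k, n_amino, dim) + r]
                                   - q[pointer(s, k, n_amino, dim) + r])
                   for r in range(1, b + 1))
        total += diff ** 2
    return total


def h_psc(q, n_amino, dim, lam1):
    """Eq. 24: zero when all consecutive amino acids are lattice neighbours."""
    chain = sum(dist2(q, m, m + 1, n_amino, dim) for m in range(1, n_amino))
    return lam1 * (-(n_amino - 1) + chain)


folded = encode([(1, 2), (1, 1), (2, 1), (2, 2)], 4, 2)
broken = encode([(3, 2), (1, 1), (2, 1), (0, 3)], 4, 2)
print(h_psc(folded, 4, 2, 4), h_psc(broken, 4, 2, 4))
```

Expanding the square in `dist2` produces only products of at most two bits, which is the reason Eq. 24 stays 2-local for any chain length.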
3. Pairwise hydrophobic interaction term, $H_{pairwise}$

The HP model favors hydrophobic interactions by lowering the energy by 1 whenever
non-nearest neighboring hydrophobic amino acids are a rectilinear distance of 1 away.
This kind of interaction is represented by the following general expression:

$$ H_{pairwise}(N, D) = -\sum_{i=1}^{N} \sum_{j=1}^{N} G_{ij}\,H_{pairwise}^{ij,D} . \tag{25} $$

Here $G$ is an $N \times N$ symmetric matrix with entries $G_{ij}$ equal to $+1$ when amino acids $i$
and $j$ are hydrophobic and non-nearest neighbors, and 0 otherwise. Note that $G_{ij}$ is set to
zero for amino acids that are neighbors in the protein sequence. Notice also that alternate
definitions of $G_{ij}$ could allow us to define lattice protein models that are more complex than
the HP model. One of these models is the more realistic Miyazawa-Jernigan model [36],
which includes interactions between 20 types of amino acids.

The form of $H_{pairwise}^{ij,D}$ depends on the spatial dimensionality of the problem. In two
dimensions, we have

$$ H_{pairwise}^{ij,2D}(N) = x_{+}^{ij,2D}(N) + x_{-}^{ij,2D}(N) + y_{+}^{ij,2D}(N) + y_{-}^{ij,2D}(N) \tag{26} $$

and in three dimensions,

$$ H_{pairwise}^{ij,3D}(N) = x_{+}^{ij,3D}(N) + x_{-}^{ij,3D}(N) + y_{+}^{ij,3D}(N) + y_{-}^{ij,3D}(N) + z_{+}^{ij,3D}(N) + z_{-}^{ij,3D}(N) . \tag{27} $$


The terms on the right hand side of Eq. 27 are independent; each one serves to query
whether the $j$-th amino acid is located, with respect to the $i$-th amino acid, to the right,
left, above, below, in front, or behind, as represented by the $x_{+}^{ij,3D}$, $x_{-}^{ij,3D}$, $y_{+}^{ij,3D}$,
$y_{-}^{ij,3D}$, $z_{+}^{ij,3D}$, and $z_{-}^{ij,3D}$ terms, respectively. If the $j$-th amino acid is located at a
distance of exactly one in any direction, $H_{pairwise}^{ij,D}$ is set to $+1$; otherwise it is set to zero.
There is a subtle but important condition embedded in these terms: they all vanish unless the
rightmost binary variable describing the $i$-th residue's coordinate of interest (say $x$ for $x_{+}$ and
$x_{-}$, $y$ for $y_{+}$ and $y_{-}$, or $z$ for $z_{+}$ and $z_{-}$) is 0, i.e., the coordinate has to
correspond to an even number. This is why we intentionally double count each pair of amino
acids in Eq. 25 by allowing both indexes $i$ and $j$ to iterate from 1 to $N$. No special treatment
is provided for the case where $i = j$, since the diagonal terms of $G_{ij}$ are all zero due to the
lack of amino acid self interaction. Finally, because we want the interaction to be attractive
rather than repulsive, we use the minus sign in Eq. 25.



The case of $N$ amino acids in a two-dimensional grid for $N = 2^{M}$ and $M \ge 3$:

The terms listed below correspond to the pairwise interaction Hamiltonian terms described
above. The expressions below were constructed for $M \ge 3$. The four amino acid case
($M = 2$) is much simpler and will be discussed in Sec. IV. The expression for $x_{+}^{ij,2D}(N)$ is

$$ x_{+}^{ij,2D}(N) = \bigl(1 - q_{f(i,1)+1}\bigr)\,q_{f(j,1)+1} \prod_{s=2}^{\log_2 N} \bigl(1 - q_{f(j,1)+s} - q_{f(i,1)+s} + 2\,q_{f(j,1)+s}\,q_{f(i,1)+s}\bigr) $$
$$ \qquad \times \prod_{r=1}^{\log_2 N} \bigl(1 - q_{f(i,2)+r} - q_{f(j,2)+r} + 2\,q_{f(i,2)+r}\,q_{f(j,2)+r}\bigr) . \tag{28} $$

The first two factors of $x_{+}^{ij,2D}(N)$ (Eq. 28) treat the rightmost binary digit of the $x$ position
of the $i$-th and $j$-th amino acid. The first factor guarantees that the $i$-th residue is in an
even position on the $x$-axis. For an interaction to be considered, the position of the $j$-th
residue on the $x$-axis must be odd, as required by the second factor, $q_{f(j,1)+1}$. The remaining
factors of $x_{+}^{ij,2D}$ are XNOR functions that ensure that the rest of the binary digits that encode
the $x$ position are equal for the $i$-th and $j$-th amino acids. Finally, all the digits encoding
the $y$ position have to be equal, so that the $i$-th and $j$-th amino acids are nearest neighbors
displaced only in the $x$-direction, forcing the two residues to be in the same row. If all these
conditions are satisfied, $x_{+}^{ij,2D}$ evaluates to $+1$; otherwise it evaluates to 0. These conditions
rely on the fact that adding 1 to an even number only changes the rightmost binary digit
from 0 to 1.
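
The behaviour described for $x_{+}^{ij,2D}$ can be checked by brute force on an $8 \times 8$ grid ($M = 3$). In the sketch below, the function `x_plus` takes the integer coordinates of the two residues rather than a full bit string; this interface, and the helper names, are assumptions made to keep the example short.

```python
from itertools import product


def to_bits(value, b):
    """Bits of value, least significant first (bits[0] is the rightmost digit)."""
    return [(value >> r) & 1 for r in range(b)]


def xnor(a, b):
    return 1 - a - b + 2 * a * b


def x_plus(xi, xj, yi, yj, b):
    """Eq. 28 evaluated on integer coordinates with b = log2(N) bits each."""
    qi, qj = to_bits(xi, b), to_bits(xj, b)
    value = (1 - qi[0]) * qj[0]           # x_i even, x_j odd
    for s in range(1, b):                 # remaining x digits must agree
        value *= xnor(qj[s], qi[s])
    for a, c in zip(to_bits(yi, b), to_bits(yj, b)):
        value *= xnor(a, c)               # all y digits must agree
    return value


b = 3                                     # N = 8
for xi, xj, yi, yj in product(range(2 ** b), repeat=4):
    expected = int(xi % 2 == 0 and xj == xi + 1 and yi == yj)
    assert x_plus(xi, xj, yi, yj, b) == expected
```

The check confirms that the term equals 1 only when the $i$-th residue sits at an even $x$ position and the $j$-th residue is its right-hand neighbor in the same row.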

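The construction of $y_{+}^{ij,2D}$ follows the same procedure as that of $x_{+}^{ij,2D}$, namely,

$$ y_{+}^{ij,2D}(N) = \bigl(1 - q_{f(i,2)+1}\bigr)\,q_{f(j,2)+1} \prod_{s=2}^{\log_2 N} \bigl(1 - q_{f(j,2)+s} - q_{f(i,2)+s} + 2\,q_{f(j,2)+s}\,q_{f(i,2)+s}\bigr) $$
$$ \qquad \times \prod_{r=1}^{\log_2 N} \bigl(1 - q_{f(i,1)+r} - q_{f(j,1)+r} + 2\,q_{f(i,1)+r}\,q_{f(j,1)+r}\bigr) . \tag{29} $$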
The construction of $x_{-}^{ij,2D}$,

$$ x_{-}^{ij,2D}(N) = \bigl(1 - q_{f(i,1)+1}\bigr)\,q_{f(j,1)+1}\,\Bigl[1 - \prod_{k=1}^{\log_2 N} \bigl(1 - q_{f(i,1)+k}\bigr)\Bigr]\,\bigl(q_{f(j,1)+2} + q_{f(i,1)+2} - 2\,q_{f(j,1)+2}\,q_{f(i,1)+2}\bigr) $$
$$ \qquad \times \prod_{r=3}^{\log_2 N} \Bigl[1 - \Bigl(q_{f(j,1)+r} + \prod_{u=2}^{r-1} q_{f(j,1)+u} - 2\prod_{u=2}^{r} q_{f(j,1)+u}\Bigr) - q_{f(i,1)+r} + 2\,q_{f(i,1)+r}\Bigl(q_{f(j,1)+r} + \prod_{u=2}^{r-1} q_{f(j,1)+u} - 2\prod_{u=2}^{r} q_{f(j,1)+u}\Bigr)\Bigr] $$
$$ \qquad \times \prod_{s=1}^{\log_2 N} \bigl(1 - q_{f(i,2)+s} - q_{f(j,2)+s} + 2\,q_{f(i,2)+s}\,q_{f(j,2)+s}\bigr) , \tag{30} $$

involves several considerations. As in the expression for $x_{+}^{ij,2D}$, the first factor
$(1 - q_{f(i,1)+1})$ tests if the $i$-th amino acid is in an even position along the $x$-axis. Here, we are
interested in querying whether the $j$-th amino acid is directly to the left of the $i$-th, and apply a
different procedure than that of Eq. 28. We add $00\cdots01$ to the $x$ coordinate of the $j$-th residue,
thus moving "right" by one unit, and use the XNOR function to check if the result matches
the $x$ coordinate of the $i$-th amino acid. The problem is not as trivial as the case of $x_{+}^{ij,2D}$.
Setting $i$ at an even coordinate value along the axis of interest forces $j$ to be in an odd
coordinate. However, adding $00\cdots01$ to an odd binary number in general will change more
digits than just the last digit due to carry bits. We used the circuit presented in Fig. 4
and the Boolean algebra introduced in Sec. III A to obtain the general expression for the
addition of $00\cdots01$ to an $n$-bit number. If we take $x = x_n \cdots x_2 x_1$ and $y = 00\cdots01$,
then the result $z = z_{n+1} z_n z_{n-1} \cdots z_2 z_1$ for the addition $z = x + y$ is the recursive algebraic
expression,

$$ z_1 = 0 , $$
$$ z_2 = 1 - x_2 , $$
$$ z_k = x_k + \prod_{u=2}^{k-1} x_u - 2\prod_{u=2}^{k} x_u \quad \text{for } 3 \le k \le n , $$
$$ z_{n+1} = \prod_{u=2}^{n} x_u . $$

As in the case of $x_{+}^{ij,2D}$, we impose conditions that guarantee that the $y$ coordinate is the
same for both amino acids (that they are in the same row).

FIG. 4: Half-adder and full-adder components for the addition circuit implemented in the pairwise
interaction Hamiltonian. We show the implementation of these two components for the addition
of two 4-bit numbers, $x = x_4 x_3 x_2 x_1$ and $y = y_4 y_3 y_2 y_1$, yielding $z = z_5 z_4 z_3 z_2 z_1$. The
addition of $n$-bit numbers can be generalized trivially. (Panels: half-adder circuit; full-adder
circuit; example addition of two 4-bit numbers.)

A special case arises when the $j$-th amino acid is at the rightmost position in the grid,
with an $x$ coordinate value of $11\cdots11$. When $00\cdots01$ is added to this coordinate, $z_{n+1}$
evaluates to 1 and the $n$ bits $z_1$ to $z_n$ evaluate to 0. Since only the first $n$ bits are used
to compare coordinates, this $z$ would be an undesirable match with an $i$-th amino acid
positioned at $x = 00\cdots00$. Notice that a value of $x = 00\cdots00$ places the $i$-th amino acid
at the minimal (leftmost) position in the grid, for which $x_{-}^{ij,2D}$ should not even be
considered. The factor $\bigl[1 - \prod_{k=1}^{\log_2 N} (1 - q_{f(i,1)+k})\bigr]$ in Eq. 30 sets the term
$x_{-}^{ij,2D}$ to 0 if the $x$ coordinate of the $i$-th amino acid is $00\cdots00$, taking care of both of
these concerns.
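
The full $x_{-}^{ij,2D}$ term, including the carry recursion and the boundary factor, can be checked by brute force in the same spirit. As before, the function `x_minus` works on integer coordinates instead of a full bit string, and all helper names are assumptions introduced for this sketch.

```python
from itertools import product
from math import prod


def bits_1_indexed(value, b):
    """Return [None, q_1, ..., q_b] with q_1 the rightmost binary digit."""
    return [None] + [(value >> r) & 1 for r in range(b)]


def xnor(a, b):
    return 1 - a - b + 2 * a * b


def x_minus(xi, xj, yi, yj, b):
    """Eq. 30 evaluated on integer coordinates with b = log2(N) bits each."""
    qi, qj = bits_1_indexed(xi, b), bits_1_indexed(xj, b)
    value = (1 - qi[1]) * qj[1]                       # x_i even, x_j odd
    value *= 1 - prod(1 - qi[k] for k in range(1, b + 1))  # exclude x_i = 0
    value *= qj[2] + qi[2] - 2 * qj[2] * qi[2]        # digit 2 of x_j + 1 equals q_i,2
    for r in range(3, b + 1):
        z_r = qj[r] + prod(qj[2:r]) - 2 * prod(qj[2:r + 1])  # digit r of x_j + 1
        value *= 1 - z_r - qi[r] + 2 * qi[r] * z_r    # XNOR(z_r, q_i,r)
    yi_bits, yj_bits = bits_1_indexed(yi, b), bits_1_indexed(yj, b)
    for s in range(1, b + 1):
        value *= xnor(yi_bits[s], yj_bits[s])         # same row
    return value


for b in (3, 4):                                      # N = 8 and N = 16
    for xi, xj in product(range(2 ** b), repeat=2):
        for yi, yj in product(range(2 ** b), repeat=2):
            expected = int(xi % 2 == 0 and xi > 0 and xj == xi - 1 and yi == yj)
            assert x_minus(xi, xj, yi, yj, b) == expected
```

Without the boundary factor, the pair $x_j = 11\cdots11$, $x_i = 00\cdots00$ would be counted as a contact, which is the wrap-around case described in the paragraph above.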
The construction of $y_{-}^{ij,2D}$ follows the same procedure as that of $x_{-}^{ij,2D}$, namely,

$$ y_{-}^{ij,2D}(N) = \bigl(1 - q_{f(i,2)+1}\bigr)\,q_{f(j,2)+1}\,\Bigl[1 - \prod_{k=1}^{\log_2 N} \bigl(1 - q_{f(i,2)+k}\bigr)\Bigr]\,\bigl(q_{f(j,2)+2} + q_{f(i,2)+2} - 2\,q_{f(j,2)+2}\,q_{f(i,2)+2}\bigr) $$
$$ \qquad \times \prod_{r=3}^{\log_2 N} \Bigl[1 - \Bigl(q_{f(j,2)+r} + \prod_{u=2}^{r-1} q_{f(j,2)+u} - 2\prod_{u=2}^{r} q_{f(j,2)+u}\Bigr) - q_{f(i,2)+r} + 2\,q_{f(i,2)+r}\Bigl(q_{f(j,2)+r} + \prod_{u=2}^{r-1} q_{f(j,2)+u} - 2\prod_{u=2}^{r} q_{f(j,2)+u}\Bigr)\Bigr] $$
$$ \qquad \times \prod_{s=1}^{\log_2 N} \bigl(1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\,q_{f(i,1)+s}\,q_{f(j,1)+s}\bigr) . \tag{31} $$

The three-dimensional extension of these equations is presented in the Appendix.



C. Maximum locality and scaling of the number of terms in Hprotein 

In this section, we estimate the number of terms included in the total Hamiltonian $H_{protein}$
and present procedures required to reduce the locality of the terms to 2-local. These
estimates assess the size of a quantum device necessary for eventual experimental realizations
of the algorithm. The reduction of the locality of the terms involves ancillary qubits.

Each amino acid requires $D\log_2 N$ qubits to specify its position in the lattice. Since our
algorithm fixes the position of two amino acids, the number of qubits needed to encode the
coordinates of the $(N - 2)$ remaining amino acids is $(N - 2)D\log_2 N$. From the expressions
given for $H_{onsite}$, $H_{psc}$ and $H_{pairwise}$, one can deduce that the maximum locality is determined
by $2D\log_2 N$, the number of qubits corresponding to two amino acids. As described in
Sec. III B 2, the $H_{psc}$ term is always 2-local in nature regardless of the number of amino
acids. For scaling arguments, it is crucial to point out that all possible 1-local and 2-local
terms, which account for $(N - 2)D\log_2 N$ and $\binom{(N-2)D\log_2 N}{2}$ total terms, respectively,
appear in the expansion, but that not all possible 3-local or higher locality terms will be
present. For example, the terms $q_i q_j q_k$, where the indexes $i$, $j$ and $k$ are associated with
three different amino acids, are not part of the expansion, since every term should only
involve products of qubits describing two amino acids, regardless of its locality. Table I
summarizes the number of $k$-local terms required to construct the protein Hamiltonian,
$H_{protein} = H_{onsite} + H_{psc} + H_{pairwise}$. The total count from the combinatorial expressions
of Table I scales as $N^{6}$ for $D = 2$ and as $N^{8}$ for $D = 3$. Table I provides the exact term
count.



TABLE I: The number of k-local terms obtained in the final expression for $H_{protein}$ as a function of the number of amino acids N, with $N = 2^M$, and the dimension D of the lattice.

Locality                           Number of terms
k = 0                              1
k = 1                              $(N-2) D \log_2 N$
$2 \le k \le D \log_2 N$           $\binom{N-2}{2} \sum_{i=1}^{k-1} \binom{D \log_2 N}{i} \binom{D \log_2 N}{k-i} + (N-2) \binom{D \log_2 N}{k}$
$D \log_2 N < k \le 2D \log_2 N$   $\binom{N-2}{2} \sum_{i=k-D\log_2 N}^{D \log_2 N} \binom{D \log_2 N}{i} \binom{D \log_2 N}{k-i}$
Total number of terms              sum of the entries above over all k
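As a cross-check of the counting argument, the rows of Table I can be evaluated numerically. The following Python sketch assumes the combinatorial expressions as read from Table I (each free amino acid contributes $D \log_2 N$ bits, and every term involves bits from at most two amino acids); the function names are illustrative and not part of the original work.

```python
from math import comb


def bits_per_residue(n_residues: int, dim: int) -> int:
    """Number of bits D*log2(N) encoding one amino acid (N a power of two)."""
    m = n_residues.bit_length() - 1
    if n_residues < 4 or (1 << m) != n_residues:
        raise ValueError("N must be a power of two with N >= 4")
    return dim * m


def k_local_terms(n_residues: int, dim: int, k: int) -> int:
    """Count of k-local terms, following the rows of Table I as read here."""
    n = bits_per_residue(n_residues, dim)
    free = n_residues - 2  # two amino acids are fixed in position
    if k == 0:
        return 1
    if k == 1:
        return free * n
    # Terms mixing i bits of one amino acid with k - i bits of another one.
    lo, hi = max(1, k - n), min(n, k - 1)
    mixed = comb(free, 2) * sum(comb(n, i) * comb(n, k - i) for i in range(lo, hi + 1))
    # Terms whose k bits all belong to a single amino acid (needs k <= n).
    single = free * comb(n, k) if k <= n else 0
    return mixed + single


if __name__ == "__main__":
    for k in range(9):
        print(k, k_local_terms(4, 2, k))
```

For N = 4 and D = 2 this gives 8 one-local and 28 two-local terms, the latter equal to $\binom{8}{2}$, and 256 terms in total when summed over k = 0, ..., 8.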



IV. CASE STUDY: HPPH 

With the goal of designing an experiment for adiabatic quantum computers with small numbers of qubits, we concentrate on the simplest possible instance of the HP model: a four amino acid loop that contains a favorable interaction and therefore "folds".

In Sec. IV A we present the protein Hamiltonian, followed by the partitioning of the N-local Hamiltonian terms to 2-local. Finally, we present numerical simulations that confirm the energy minimum found by the proposed algorithm.

A. Hamiltonian terms for the case of four amino acids in 2D

1. Onsite term, H_onsite

The onsite Hamiltonian for this example takes the form

$$ H_{onsite}(N=4, D=2) = \lambda_0 \left( H^{12}_{onsite} + H^{13}_{onsite} + H^{14}_{onsite} + H^{24}_{onsite} + H^{34}_{onsite} \right) \tag{32} $$

with

$$ H^{ij}_{onsite}(N=4, D=2) = \prod_{k=1}^{2} \prod_{r=1}^{2} \left( 1 - q_{f(i,k)+r} - q_{f(j,k)+r} + 2\, q_{f(i,k)+r}\, q_{f(j,k)+r} \right) \tag{33} $$

and

$$ f(i,k) = 4(i-1) + 2(k-1). \tag{34} $$
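To make the index map concrete, the following Python sketch evaluates f(i, k) and the pair term $H^{ij}_{onsite}$ for a given bit assignment. It assumes the form of Eqs. 33 and 34 as read here, with the first bit of each coordinate taken as the least significant one; the helper `encode` is a hypothetical convenience function, not something defined in the text.

```python
def f(i: int, k: int) -> int:
    """Offset of coordinate k (1 = x, 2 = y) of amino acid i (1-based), Eq. 34."""
    return 4 * (i - 1) + 2 * (k - 1)


def encode(positions):
    """Hypothetical helper: positions[i-1] = (x, y), 0 <= x, y <= 3 -> {n: q_n}."""
    q = {}
    for i, coords in enumerate(positions, start=1):
        for k, c in enumerate(coords, start=1):
            for r in (1, 2):
                q[f(i, k) + r] = (c >> (r - 1)) & 1  # r = 1 is the lowest bit
    return q


def onsite_pair(q, i: int, j: int) -> int:
    """Pair term of Eq. 33: 1 if amino acids i and j share a lattice site."""
    value = 1
    for k in (1, 2):
        for r in (1, 2):
            a, b = q[f(i, k) + r], q[f(j, k) + r]
            value *= 1 - a - b + 2 * a * b  # XNOR: 1 when the two bits agree
    return value


if __name__ == "__main__":
    q = encode([(1, 0), (1, 1), (2, 1), (1, 0)])
    print(onsite_pair(q, 1, 4), onsite_pair(q, 1, 2))
```

Each factor is the XNOR of two bits, so the product equals 1 only when all four coordinate bits of the two amino acids coincide.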



Note that $H^{23}_{onsite}$ does not appear in Eq. 32 since, as described in Sec. II A, the two central amino acids are fixed in position and are guaranteed a priori not to occupy overlapping grid points, which would contribute an energy penalty to the onsite term. On the other hand, other terms involving amino acids 2 and 3 cannot be discarded, since these amino acids affect their other neighbors through $H_{psc}$ and can participate in hydrophobic interactions through $H_{pairwise}$.

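2. Primary structure constraint term, H_psc

The squared distance between amino acids P and Q is

$$ d^2_{PQ}(N=4, D=2) = \sum_{k=1}^{2} \left( \sum_{r=1}^{2} 2^{r-1} \left( q_{f(P,k)+r} - q_{f(Q,k)+r} \right) \right)^2. \tag{35} $$

With this definition, the primary structure constraint term

$$ H_{psc}(N=4, D=2) = \lambda_1 \left( -3 + d^2_{12} + d^2_{23} + d^2_{34} \right) = \lambda_1 \left( -2 + d^2_{12} + d^2_{34} \right) \tag{36} $$

takes advantage of the fact that $d^2_{23} = 1$ by construction.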
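The distance term can be checked on a concrete conformation. The sketch below is a minimal Python illustration, assuming the binary weights $2^{r-1}$ and the index map f(i, k) = 4(i-1) + 2(k-1) as read here; the example bit string (amino acids at (1,0), (1,1), (2,1), (2,0)) and the default weight `lam1 = 1` are illustrative choices, not values taken from the text.

```python
def f(i: int, k: int) -> int:
    """Offset of coordinate k of amino acid i (both 1-based)."""
    return 4 * (i - 1) + 2 * (k - 1)


def dist2(q, p: int, s: int) -> int:
    """Squared lattice distance between amino acids p and s from their bits."""
    total = 0
    for k in (1, 2):
        diff = sum(2 ** (r - 1) * (q[f(p, k) + r] - q[f(s, k) + r]) for r in (1, 2))
        total += diff ** 2
    return total


def h_psc(q, lam1: int = 1) -> int:
    """Primary structure constraint for N = 4, D = 2 (lam1 is an assumed weight)."""
    return lam1 * (-3 + dist2(q, 1, 2) + dist2(q, 2, 3) + dist2(q, 3, 4))


# Bits q_1 ... q_16 for the positions (1,0), (1,1), (2,1), (2,0).
BITS = [1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
Q = {n: b for n, b in enumerate(BITS, start=1)}

if __name__ == "__main__":
    print(dist2(Q, 2, 3), h_psc(Q))
```

For this conformation every consecutive pair is one lattice step apart, so the constraint term evaluates to zero; any stretched bond would raise it by a positive multiple of the weight.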



3. Pairwise term, H_pairwise

Finally, a pairwise interaction term is required to impose an energy stabilization for non-nearest neighbor hydrophobic amino acids that occupy adjacent sites in the lattice. For the sequence HPPH,

$$ G = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix} \tag{37} $$

and therefore,

$$ H_{pairwise}(N=4) = -\left( H^{14}_{pairwise} + H^{41}_{pairwise} \right). \tag{38} $$

For this particular case of interest

$$ H^{ij}_{pairwise}(N=4) = x^{ij,2D}_{+}(N=4) + x^{ij,2D}_{-}(N=4) + y^{ij,2D}_{+}(N=4) + y^{ij,2D}_{-}(N=4). \tag{39} $$

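The explicit forms of these functions are:

$$ x^{ij,2D}_{+}(N=4) = (1 - q_{f(i,1)+1})\, q_{f(j,1)+1} \left( 1 - q_{f(j,1)+2} - q_{f(i,1)+2} + 2\, q_{f(j,1)+2}\, q_{f(i,1)+2} \right) \prod_{s=1}^{2} \left( 1 - q_{f(i,2)+s} - q_{f(j,2)+s} + 2\, q_{f(i,2)+s}\, q_{f(j,2)+s} \right), \tag{40} $$

$$ y^{ij,2D}_{+}(N=4) = (1 - q_{f(i,2)+1})\, q_{f(j,2)+1} \left( 1 - q_{f(j,2)+2} - q_{f(i,2)+2} + 2\, q_{f(j,2)+2}\, q_{f(i,2)+2} \right) \prod_{s=1}^{2} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right), \tag{41} $$

$$ x^{ij,2D}_{-}(N=4) = (1 - q_{f(i,1)+1})\, q_{f(j,1)+1}\, q_{f(i,1)+2} \left( q_{f(j,1)+2} + q_{f(i,1)+2} - 2\, q_{f(j,1)+2}\, q_{f(i,1)+2} \right) \prod_{s=1}^{2} \left( 1 - q_{f(i,2)+s} - q_{f(j,2)+s} + 2\, q_{f(i,2)+s}\, q_{f(j,2)+s} \right), \tag{42} $$

$$ y^{ij,2D}_{-}(N=4) = (1 - q_{f(i,2)+1})\, q_{f(j,2)+1}\, q_{f(i,2)+2} \left( q_{f(j,2)+2} + q_{f(i,2)+2} - 2\, q_{f(j,2)+2}\, q_{f(i,2)+2} \right) \prod_{s=1}^{2} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right). \tag{43} $$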



After expanding all of the terms in $H_{onsite}$, $H_{psc}$ and $H_{pairwise}$, we fix amino acids 2 and 3 as described in Sec. II A, substituting the variables $q_{12} q_{11} q_{10} q_9\; q_8 q_7 q_6 q_5$ by the constant values 0110 0101 as shown in Fig. 2. The final expression for $H_{protein}$ now depends on the 8 binary variables encoding the coordinates of amino acids 1 and 4, $q_4 q_3 q_2 q_1$ and $q_{16} q_{15} q_{14} q_{13}$, respectively. For convenience in notation, we relabel the coordinates of amino acid 4 from $q_{16} q_{15} q_{14} q_{13}$ to $q_8 q_7 q_6 q_5$. After these substitutions, the final expression for the energy function $H_{protein}$ depends on products involving the variables $q_1$ through $q_8$. Following the mapping explained at the end of Sec. II A, the quantum expression for $\hat{H}_{protein}$ is a $2^8 \times 2^8$ matrix. This Hamiltonian matrix defines the final Hamiltonian $\hat{H}(t = \tau)$ of the adiabatic evolution. The initial Hamiltonian representing the transverse field, whose ground state is a linear superposition of all $2^8$ states in the computational basis, can be written as

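$$ \hat{H}_0 = \sum_{i=1}^{8} \hat{q}_x^{\,i}, \tag{44} $$

with

$$ |\psi_g(t=0)\rangle = \frac{1}{\sqrt{2^8}} \sum_{q_i \in \{0,1\}} |q_8\rangle |q_7\rangle \cdots |q_1\rangle. \tag{45} $$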



Finally, we can construct a time-dependent Hamiltonian following the adiabatic interpolation introduced earlier,

$$ \hat{H}(t) = (1 - t/\tau)\, \hat{H}_0 + (t/\tau)\, \hat{H}_{protein}. \tag{46} $$



This time-dependent Hamiltonian is also a $2^8 \times 2^8$ matrix. The instantaneous spectrum can be obtained by diagonalizing it at every $t/\tau$ without the need to specify $\tau$. Since $\tau$ is the running time, we are interested in $0 \le t/\tau \le 1$. The spectrum of the corresponding $\hat{H}(t)$ for this four amino acid peptide HPPH is given in Fig. 5.

FIG. 5: (Color online) Spectrum of the instantaneous energy eigenvalues for the 8-local time dependent Hamiltonian used in the algorithm for the peptide HPPH (left). The plot to the right examines the lowest 15 states of the 256 states from the left.



Snapshots of the instantaneous ground state are shown in Fig. 6. Even though these snapshots do not correspond to explicit propagation of the Schrödinger equation, they indicate that the final $\hat{H}_{protein}$ is correct and that it provides the correct answer if a sufficiently long time $\tau$ is allowed. Notice that at $t/\tau = 0$, the amplitude for all 256 states is equal, indicating a uniform superposition of all states; at $t/\tau = 1$, the readout corresponds to the two degenerate solutions of HPPH.
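The two degenerate solutions can also be found by classical enumeration of the 256 bit strings. The following Python sketch is only an illustration under stated assumptions: it evaluates the onsite, primary-structure and pairwise contributions directly on decoded lattice coordinates rather than through the expanded polynomial, it places amino acids 2 and 3 at (1, 1) and (2, 1) as implied by the fixed bit values 0101 and 0110, and it uses penalty weights $\lambda_0 = 3$ and $\lambda_1 = 2$, which are not given in the text.

```python
from itertools import product

FIXED = {2: (1, 1), 3: (2, 1)}  # amino acids 2 and 3 (bits 0101 and 0110)
LAMBDA0, LAMBDA1 = 3, 2         # assumed penalty weights, not from the text


def decode(bits4):
    """Four bits in increasing index order -> (x, y), lowest bit first."""
    return bits4[0] + 2 * bits4[1], bits4[2] + 2 * bits4[3]


def dist2(a, b):
    """Squared lattice distance between two sites."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def energy(q):
    """q = (q1, ..., q8): q1..q4 encode amino acid 1, q5..q8 amino acid 4."""
    pos = {1: decode(q[:4]), 4: decode(q[4:]), **FIXED}
    pairs = [(1, 2), (1, 3), (1, 4), (2, 4), (3, 4)]
    onsite = sum(dist2(pos[i], pos[j]) == 0 for i, j in pairs)  # overlaps
    psc = -2 + dist2(pos[1], pos[2]) + dist2(pos[3], pos[4])    # chain bonds
    contact = dist2(pos[1], pos[4]) == 1                        # H-H contact
    return LAMBDA0 * onsite + LAMBDA1 * psc - int(contact)


energies = {q: energy(q) for q in product((0, 1), repeat=8)}
e_min = min(energies.values())
ground_states = sorted(q for q, e in energies.items() if e == e_min)

if __name__ == "__main__":
    print(e_min, ground_states)
```

Under these assumptions the minimum energy is -1 and is reached by exactly two bit strings, which place amino acids 1 and 4 at (1, 0) and (2, 0) or at (1, 2) and (2, 2).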



V. CONVERTING AN N-LOCAL HAMILTONIAN TO A 2-LOCAL HAMILTONIAN

Motivated by the possibility of an experimental implementation, we explain how to reduce the locality of a Hamiltonian from k-local to 2-local while conserving its low-lying spectrum. We use Boolean reduction techniques [37, 38] for Hamiltonians constructed from energy functions with structure similar to $H_{protein}$, where all of the terms are sums of tensor products of $\sigma^z$ operators. In reducing the locality of the interactions, we introduce new ancilla qubits to represent higher-order interactions with sums of at most 2-local terms. Here, we present an illustrative example with a relatively simple energy function, but the methodology can be immediately extended to higher-locality energy functions such as the one resulting in $H_{protein}$.

Consider a 4-local energy function of the form

$$ H_{toy}(q) = 1 + q_1 - q_2 + q_3 + q_4 - q_1 q_2 q_3 + q_1 q_2 q_3 q_4. \tag{47} $$

As shown in Table II, this energy function has a unique minimum energy given by $q = q_4 q_3 q_2 q_1 = 0010$. The energy associated with this configuration is 0 in arbitrary units, and all other possible values of the binary variables $q_1$, $q_2$, $q_3$ and $q_4$ have energies ranging from 1 to 4.

The goal is to obtain an energy function $H'$ that preserves these energies along with their associated bit strings, but is defined using only 1-local and 2-local terms. That is, the goal is to obtain a substitution for $H_{toy}$ with the following form,

$$ H'(\tilde{q}_1, \ldots, \tilde{q}_M) = c_0 + \sum_{i=1}^{M} c_i \tilde{q}_i + \sum_{i=1}^{M-1} \sum_{j=i+1}^{M} c_{ij}\, \tilde{q}_i \tilde{q}_j. \tag{48} $$



In Eq. 48 the new set of binary variables $\tilde{q}$ includes the original variables $q_i$ as well as the ancillary variables required to reduce locality. The extra ancillary bits raise the total number of variables to M.

TABLE II: Truth table for the energy function $H_{toy}(q) = 1 + q_1 - q_2 + q_3 + q_4 - q_1 q_2 q_3 + q_1 q_2 q_3 q_4$.

q_4   q_3   q_2   q_1   H_toy(q_1, q_2, q_3, q_4)
0     0     1     0     0
0     0     0     0     1
0     0     1     1     1
0     1     1     0     1
0     1     1     1     1
1     0     1     0     1
0     0     0     1     2
0     1     0     0     2
1     0     0     0     2
1     0     1     1     2
1     1     1     0     2
0     1     0     1     3
1     0     0     1     3
1     1     0     0     3
1     1     1     1     3
1     1     0     1     4
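The truth table of the toy function is small enough to enumerate directly. The following Python sketch, which assumes only the form of Eq. 47 as written above, lists the 16 configurations sorted by energy and then by bit string $q_4 q_3 q_2 q_1$; the variable names are illustrative.

```python
from itertools import product


def h_toy(q1: int, q2: int, q3: int, q4: int) -> int:
    """4-local toy energy function of Eq. 47."""
    return 1 + q1 - q2 + q3 + q4 - q1 * q2 * q3 + q1 * q2 * q3 * q4


# Rows are (energy, "q4q3q2q1"), sorted by energy and then by bit string.
table = sorted(
    (h_toy(q1, q2, q3, q4), f"{q4}{q3}{q2}{q1}")
    for q1, q2, q3, q4 in product((0, 1), repeat=4)
)

if __name__ == "__main__":
    for e, bits in table:
        print(bits, e)
```

The first row is the unique minimum, bit string 0010 with energy 0, and the last row is 1101 with energy 4.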



Since the information contained within the problem and the solution we are seeking both rely on the original set of q variables (in the case of protein folding, for example, the string q encodes the positions of the amino acids in the lattice), we must be able to identify values corresponding to the original q, regardless of the substitutions made to convert a k-local function to a 2-local one. The new energy function $H'$ needs to have the energy values of the original function in its energy spectrum. In addition, the values of the bit string q for these energies must match the same values of q in the original function. For the particular example of Eq. 47, consider the substitutions $q_1 q_2 \to \tilde{q}_5$ and $q_3 q_4 \to \tilde{q}_6$. These two substitutions introduce two new independent binary variables, $\tilde{q}_5$ and $\tilde{q}_6$, which can take any value in {0, 1} regardless of the values of $q_1$, $q_2$, $q_3$ and $q_4$. Since we want to preserve both the physical meaning of the original energy function and its energy spectrum, we need to act on the cases where the conditions $\tilde{q}_5 = q_1 \wedge q_2$ and $\tilde{q}_6 = q_3 \wedge q_4$ are not satisfied, since these cases lack any meaning in the context of the original energy function. One way to address this problem while keeping the original spectrum intact is to add a penalty function which enforces the conditions $\tilde{q}_5 = q_1 \wedge q_2$ and $\tilde{q}_6 = q_3 \wedge q_4$. For every substitution of the form $q_i q_j \to \tilde{q}_n$, consider a function of the form [37]

$$ H_{\wedge}(q_i, q_j, \tilde{q}_n) = \delta \left( 3 \tilde{q}_n + q_i q_j - 2 q_i \tilde{q}_n - 2 q_j \tilde{q}_n \right). \tag{49} $$

As shown in Table III, for $\delta > 0$, the function $H_{\wedge}(q_i, q_j, \tilde{q}_n)$ is greater than zero whenever $\tilde{q}_n \neq q_i \wedge q_j$ and it evaluates to zero whenever $\tilde{q}_n = q_i \wedge q_j$.

TABLE III: Truth table for the function $H_{\wedge}(q_i, q_j, \tilde{q}_n) = \delta(3\tilde{q}_n + q_i q_j - 2 q_i \tilde{q}_n - 2 q_j \tilde{q}_n)$ used for the locality reduction procedure described in Sec. V.

q̃_n   q_i   q_j   H_∧(q_i, q_j, q̃_n)
0     0     0     0
0     0     1     0
0     1     0     0
1     1     1     0
1     0     0     3δ
1     0     1     δ
1     1     0     δ
0     1     1     δ
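The defining property of the penalty function can be verified by enumerating its eight inputs. The Python sketch below assumes the form of Eq. 49 as read here, $\delta(3\tilde{q}_n + q_i q_j - 2 q_i \tilde{q}_n - 2 q_j \tilde{q}_n)$; the function name `h_and` and the choice $\delta = 1$ are illustrative.

```python
from itertools import product


def h_and(qi: int, qj: int, qn: int, delta: int = 1) -> int:
    """Penalty that vanishes only when qn equals the AND of qi and qj."""
    return delta * (3 * qn + qi * qj - 2 * qi * qn - 2 * qj * qn)


# Keys are (qn, qi, qj), values are the penalty in units of delta.
penalties = {
    (qn, qi, qj): h_and(qi, qj, qn)
    for qn, qi, qj in product((0, 1), repeat=3)
}

if __name__ == "__main__":
    for key, value in sorted(penalties.items()):
        print(key, value)
```

The four consistent assignments carry zero penalty, while the four inconsistent ones cost δ or 3δ, so any positive δ separates the two groups.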



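A 2-local expression of the form presented in Eq. 48 can be obtained by adding one $H_{\wedge}(q_i, q_j, \tilde{q}_n)$ function for each of the substitutions $q_1 q_2 \to \tilde{q}_5$ and $q_3 q_4 \to \tilde{q}_6$, and by making the additional trivial substitutions $q_1 \to \tilde{q}_1$, $q_2 \to \tilde{q}_2$, $q_3 \to \tilde{q}_3$, and $q_4 \to \tilde{q}_4$, which conveniently change the notation to the set of binary variables $\tilde{q}$. For the case of the energy function of Eq. 47, the locality-reduced version is

$$ \begin{aligned} H_{toy,reduced}(\tilde{q}) &= 1 + \tilde{q}_1 - \tilde{q}_2 + \tilde{q}_3 + \tilde{q}_4 - \tilde{q}_5 \tilde{q}_3 + \tilde{q}_5 \tilde{q}_6 + H_{\wedge}(\tilde{q}_1, \tilde{q}_2, \tilde{q}_5) + H_{\wedge}(\tilde{q}_3, \tilde{q}_4, \tilde{q}_6) \\ &= 1 + \tilde{q}_1 - \tilde{q}_2 + \tilde{q}_3 + \tilde{q}_4 - \tilde{q}_5 \tilde{q}_3 + \tilde{q}_5 \tilde{q}_6 + \delta \left( 3 \tilde{q}_5 + \tilde{q}_1 \tilde{q}_2 - 2 \tilde{q}_1 \tilde{q}_5 - 2 \tilde{q}_2 \tilde{q}_5 \right) \\ &\quad + \delta \left( 3 \tilde{q}_6 + \tilde{q}_3 \tilde{q}_4 - 2 \tilde{q}_3 \tilde{q}_6 - 2 \tilde{q}_4 \tilde{q}_6 \right). \end{aligned} \tag{50} $$

Recall that the additional functions $H_{\wedge}(\tilde{q}_1, \tilde{q}_2, \tilde{q}_5)$ and $H_{\wedge}(\tilde{q}_3, \tilde{q}_4, \tilde{q}_6)$ increase the energy of $H_{toy,reduced}$ by at least $\delta$ whenever the conditions $\tilde{q}_5 = \tilde{q}_1 \wedge \tilde{q}_2$ and $\tilde{q}_6 = \tilde{q}_3 \wedge \tilde{q}_4$ are not satisfied. Table IV shows the one-to-one mapping between the energies of the non-penalized configurations of $H_{toy,reduced}(\tilde{q})$ and the configurations presented in Table II associated with $H_{toy}(q)$. Even though there is a unique configuration $\{\tilde{q}_6 = \tilde{q}_3 \wedge \tilde{q}_4,\ \tilde{q}_5 = \tilde{q}_1 \wedge \tilde{q}_2,\ \tilde{q}_4, \tilde{q}_3, \tilde{q}_2, \tilde{q}_1\}$ associated with every $\{q_1, q_2, q_3, q_4\}$ with the same energy, it does not necessarily hold that the lowest $2^4$ out of the $2^6$ energies of $H_{toy,reduced}$ consist of the $2^4$ energies of $H_{toy}$. For example, if we pick a small penalty $\delta$ in Table IV, say $0 < \delta < 4$, then some of the states penalized by either $H_{\wedge}(\tilde{q}_1, \tilde{q}_2, \tilde{q}_5)$ or $H_{\wedge}(\tilde{q}_3, \tilde{q}_4, \tilde{q}_6)$ can still have an energy within the energy values of $H_{toy}$. To avoid this situation, we can choose $\delta > \max(H_{toy})$, which is sufficient to remove the energies of the penalized states from the region corresponding to energies of $H_{toy}$, therefore conserving the low-lying spectrum of the original $H_{toy}$. Using the mapping explained at the end of Sec. II A, the quantum version of the 4-local energy function from Eq. 47 is:

$$ \hat{H}_{toy} = I + \hat{q}_1 - \hat{q}_2 + \hat{q}_3 + \hat{q}_4 - \hat{q}_1 \hat{q}_2 \hat{q}_3 + \hat{q}_1 \hat{q}_2 \hat{q}_3 \hat{q}_4. \tag{51} $$

The quantum version of the 2-local reduced form presented in Eq. 50 is

$$ \begin{aligned} \hat{H}_{toy,reduced} &= I + \hat{q}_1 - \hat{q}_2 + \hat{q}_3 + \hat{q}_4 - \hat{q}_5 \hat{q}_3 + \hat{q}_5 \hat{q}_6 + \delta \left( 3 \hat{q}_5 + \hat{q}_1 \hat{q}_2 - 2 \hat{q}_1 \hat{q}_5 - 2 \hat{q}_2 \hat{q}_5 \right) \\ &\quad + \delta \left( 3 \hat{q}_6 + \hat{q}_3 \hat{q}_4 - 2 \hat{q}_3 \hat{q}_6 - 2 \hat{q}_4 \hat{q}_6 \right). \end{aligned} \tag{52} $$

Notice that $\hat{H}_{toy}$ acts on a $2^4$-dimensional Hilbert space, $\mathrm{span}\{|q_4\rangle \otimes |q_3\rangle \otimes |q_2\rangle \otimes |q_1\rangle\}$, while $\hat{H}_{toy,reduced}$ acts on a $2^6$-dimensional Hilbert space, $\mathrm{span}\{|q_6\rangle \otimes |q_5\rangle \otimes |q_4\rangle \otimes |q_3\rangle \otimes |q_2\rangle \otimes |q_1\rangle\}$. Due to the conservation of the spectrum and bit strings described above (as reflected in Tables II and IV), the solution obtained from an adiabatic quantum algorithm using either $\hat{H}_{toy}$ or $\hat{H}_{toy,reduced}$ as $\hat{H}_{final}$,

$$ \hat{H}(t) = (1 - t/\tau)\, \hat{H}(0) + (t/\tau)\, \hat{H}_{final}, \tag{53} $$

should be the same.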
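The role of the penalty strength can be illustrated classically by comparing the lowest energy levels of the reduced function with those of the original one. The Python sketch below assumes the forms of Eqs. 47, 49 and 50 as read here; the function names and the two test values of δ (5 and 1) are illustrative choices.

```python
from itertools import product


def h_toy(q1, q2, q3, q4):
    """Original 4-local toy function."""
    return 1 + q1 - q2 + q3 + q4 - q1 * q2 * q3 + q1 * q2 * q3 * q4


def h_and(qi, qj, qn, delta):
    """Penalty enforcing qn = qi AND qj."""
    return delta * (3 * qn + qi * qj - 2 * qi * qn - 2 * qj * qn)


def h_reduced(q, delta):
    """2-local reduced function with ancillas q5 (for q1*q2) and q6 (for q3*q4)."""
    q1, q2, q3, q4, q5, q6 = q
    return (1 + q1 - q2 + q3 + q4 - q5 * q3 + q5 * q6
            + h_and(q1, q2, q5, delta) + h_and(q3, q4, q6, delta))


def lowest_levels(delta, count=16):
    """The `count` lowest energies of the reduced function over all 64 states."""
    return sorted(h_reduced(q, delta) for q in product((0, 1), repeat=6))[:count]


original = sorted(h_toy(*q) for q in product((0, 1), repeat=4))
ground = min(product((0, 1), repeat=6), key=lambda q: h_reduced(q, 5))

if __name__ == "__main__":
    print(lowest_levels(5) == original, lowest_levels(1) == original, ground)
```

With δ = 5 the 16 lowest levels coincide with the spectrum of the original function and the minimum keeps the original bits $q_4 q_3 q_2 q_1 = 0010$ with both ancillas at 0, whereas with δ = 1 penalized states intrude into the low-energy window.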

In the case of the 2-local Hamiltonian $\hat{H}_{toy,reduced}$, the solution to the optimization problem is obtained using an adiabatic algorithm by reading the qubits associated with $q_4, q_3, q_2, q_1$ from the space $\mathrm{span}\{|q_6\rangle \otimes |q_5\rangle \otimes |q_4\rangle \otimes |q_3\rangle \otimes |q_2\rangle \otimes |q_1\rangle\}$ at $t = \tau$. Notice that the ancillary qubits in the six-qubit version do not carry any physical information, as expected, since all of the valuable information is stored in the qubits coming from the original expression before the reduction. The cost of reducing the locality of a Hamiltonian to one which contains at most two-body interactions is the increase in resources due to the additional ancillary bits.

Figure 7 shows the eigenenergies of Eq. 53 vs. $t/\tau$, where $\hat{H}_{final}$ is replaced by $\hat{H}_{toy}$ (see Fig. 7(a)), and by $\hat{H}_{toy,reduced}$ with $\delta = 5$ (see Fig. 7(b)). As expected from Tables II and IV, Fig. 7 illustrates the preservation of the subsystem corresponding to the variables $q_1$, $q_2$, $q_3$ and $q_4$ in the ground state of both the original and the reduced-locality Hamiltonian. Degeneracy and overlap of lines in the spectra in Fig. 7 make it difficult to graphically convey that both spectra indeed have 16 states for $0 \le$ eigenenergies $\le 4$. In Fig. 7(b) we plotted the first 19 eigenstates out of the $2^6$ eigenstates corresponding to $\hat{H}_{toy,reduced}$. At $t/\tau = 1$, states with energy greater than 4 correspond to states which violate the AND condition introduced by the reduction process. Notice that there are two eigenstates with eigenvalue 5, in agreement with the table presented in Appendix B after substituting $\delta = 5$, and one state which corresponds to one of the four-degenerate manifold with E = 6.





FIG. 7: (Color online) Spectrum comparison of the instantaneous energy eigenvalues for the 4-local toy Hamiltonian $\hat{H}_{toy}$ (left) and its corresponding 2-local version $\hat{H}_{toy,reduced}$ (right), plotted against $t/\tau$. (left) Full spectrum of the $2^4$ instantaneous eigenvalues for $\hat{H}_{toy}(q_1, q_2, q_3, q_4)$. (right) First 19 instantaneous eigenvalues for the 2-local version of $\hat{H}_{toy}$, denoted as $\hat{H}_{toy,reduced}$ in the text. The value used for $\delta$ is 5. The first $2^4$ levels, $0 \le$ eigenvalues $\le 4$, are associated with the original levels from $\hat{H}_{toy}$. The three remaining states with eigenvalues greater than 4 are penalized states which violate the conditions $\tilde{q}_n = q_i \wedge q_j$ (see Table IV for details).



VI. RESOURCES NEEDED FOR A 2-LOCAL HAMILTONIAN EXPRESSION IN 
PROTEIN FOLDING 



For any k-local energy function, e.g., $h = q_1 q_2 \cdots q_k$, the reduction can be carried out iteratively, adding the penalty function $H_{\wedge}(q_i, q_j, \tilde{q}_n)$ for every substitution of the form $q_i q_j \to \tilde{q}_n$. For a k-local term, (k - 2) substitutions are required for the reduction to 2-local, and therefore (k - 2) ancillary bits are required.

In the particular case of the protein Hamiltonian the reduction procedure needs to be repeated $(N-2)(N^D - D \log_2 N - 1)$ times, as described below. All the terms in the HP Hamiltonian include interactions among at most two amino acids, which results in a maximum locality of $2D \log_2 N$. In the following discussion, the cluster notation [k][l] specifies the decomposition of a particular (k + l)-local term into k variables coming from an amino acid with index i and l variables from an amino acid with index j. Since all the terms are of this form, to obtain a 2-local Hamiltonian, all products corresponding to each [k] and [l] of each cluster have to be converted to 1-local terms. We reduce terms for the variables describing each amino acid in turn, for a total of $D \log_2 N$ variables. All possible combinations of two variables from the $D \log_2 N$ variables for an amino acid are substituted. The number of ancillary bits required for this substitution is $\binom{D \log_2 N}{2}$. These substitutions convert all terms of the form [3][0] and [2][1] to 2-local. To convert terms of the form [4][0] or [3][1] to 2-local we need to consider the $\binom{D \log_2 N}{3}$ terms originally containing three variables from one amino acid. After employing an additional ancillary bit per term and applying the previous reduction step, all these terms collapse to 1-local with respect to the i-th amino acid, i.e., these terms will assume the form [1][l]. Iterating over the $D \log_2 N$ variables for a specific amino acid in order of increasing locality will give us the number of substitutions or ancilla bits needed per amino acid in order to reduce a particular cluster [k] to [1] or 1-local. The total number of substitutions per amino acid corresponds to $\sum_{k=2}^{D \log_2 N} \binom{D \log_2 N}{k} = N^D - D \log_2 N - 1$. To carry out the procedure for all (N - 2) amino acids the number of ancilla qubits required is $(N-2)(N^D - D \log_2 N - 1)$. The number of qubits needed to represent a 2-local Hamiltonian version of the protein Hamiltonian is given by adding the number of ancillary qubits to the number of original $(N-2) D \log_2 N$ quantum bits,

$$ \text{\# of total qubits for a 2-local expression} = (N-2)(N^D - D \log_2 N - 1) + (N-2) D \log_2 N = (N-2)(N^D - 1). \tag{54} $$

Eq. 54 provides a closed formula for the number of qubits needed to find the lowest energy conformations for a protein with N amino acids in D dimensions in our encoding. In particular, the case of a four amino acid peptide HPPH in two dimensions considered in Sec. IV requires 30 qubits.
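The resource count can be reproduced with a few lines of Python. This sketch assumes the counting argument as read here, namely one ancilla for every subset of two or more of the $D \log_2 N$ bits of each free amino acid; the function names are illustrative.

```python
from math import comb


def ancillas_per_residue(n_residues: int, dim: int) -> int:
    """One ancilla per subset of size >= 2 of the D*log2(N) bits of a residue."""
    n = dim * (n_residues.bit_length() - 1)  # D * log2(N) for N a power of two
    return sum(comb(n, k) for k in range(2, n + 1))


def total_qubits(n_residues: int, dim: int) -> int:
    """Ancillary plus original qubits for the (N - 2) free amino acids."""
    n = dim * (n_residues.bit_length() - 1)
    free = n_residues - 2
    return free * ancillas_per_residue(n_residues, dim) + free * n


if __name__ == "__main__":
    print(ancillas_per_residue(4, 2), total_qubits(4, 2))
```

For N = 4 and D = 2 the sketch gives 11 ancillas per amino acid and 30 qubits in total, matching the closed form $(N-2)(N^D - 1)$.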



VII. CONCLUSIONS 



We constructed the essential elements of an adiabatic quantum algorithm to find the lowest energy conformations of a protein in a lattice model. The number of binary variables needed to represent N amino acids on a D-dimensional lattice with N sites per side is $(N-2) D \log_2 N$. The maximum locality of the final Hamiltonian, as determined by the interaction between pairs of amino acids using the mapping defined here, is $2D \log_2 N$.

General strategies to construct energy functions to map into other quantum mechanical Hamiltonians used for adiabatic quantum computing were presented. The strategies used in the construction of the Hamiltonian for the HP model can be used as general building blocks for Hamiltonians associated with physical systems where onsite energies and/or pairwise potentials are present.

We also demonstrated an application of the Boolean scheme for converting a k-local Hamiltonian into a 2-local Hamiltonian, aiming toward an experimental implementation in quantum devices. The resulting couplings, although 2-local, do not necessarily represent couplings among nearest neighbor quantum bits in a two-dimensional geometry. It is however known that the number of ancillary physical qubits required to embed an arbitrary N-variable problem is upper-bounded by $N^2/(C - 2)$, where C is the number of couplers allowed per physical qubit.

The most important question remaining to be explored in future work is the scaling of the run time $\tau$ with respect to the number of amino acids N. The run time $\tau$ depends on the particular instance of the problem, which in our case means the particular protein sequence. It has been proposed that proteins have evolved towards a many-dimensional funnel-like potential energy surface [7]. The sequences that show a funnel-like structure might be easier to study using adiabatic quantum computation, because the funnel structure may facilitate annealing of the quantum wave function toward low energy conformations.

APPENDIX A: EXTENSION OF THE PAIRWISE INTERACTION TO THREE DIMENSIONS AND N AMINO ACIDS, $N = 2^M$ AND $M \ge 3$



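This extension follows the principles presented in Sec. III B 3 and extends the terms of the Hamiltonian to the case of a three-dimensional lattice protein. The pairwise term for the three-dimensional case is,

$$ H_{pairwise}(N) = -\sum_{i,j=1}^{N} G_{ij}\, H^{ij}_{pairwise}(N), \qquad H^{ij}_{pairwise}(N) = x^{ij,3D}_{+} + x^{ij,3D}_{-} + y^{ij,3D}_{+} + y^{ij,3D}_{-} + z^{ij,3D}_{+} + z^{ij,3D}_{-}, \tag{A1} $$

$$ \begin{aligned} x^{ij,3D}_{+}(N) &= (1 - q_{f(i,1)+1})\, q_{f(j,1)+1} \prod_{s=2}^{\log_2 N} \left( 1 - q_{f(j,1)+s} - q_{f(i,1)+s} + 2\, q_{f(j,1)+s}\, q_{f(i,1)+s} \right) \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,2)+s} - q_{f(j,2)+s} + 2\, q_{f(i,2)+s}\, q_{f(j,2)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,3)+r} - q_{f(j,3)+r} + 2\, q_{f(i,3)+r}\, q_{f(j,3)+r} \right), \end{aligned} \tag{A2} $$

$$ \begin{aligned} y^{ij,3D}_{+}(N) &= (1 - q_{f(i,2)+1})\, q_{f(j,2)+1} \prod_{s=2}^{\log_2 N} \left( 1 - q_{f(j,2)+s} - q_{f(i,2)+s} + 2\, q_{f(j,2)+s}\, q_{f(i,2)+s} \right) \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,3)+r} - q_{f(j,3)+r} + 2\, q_{f(i,3)+r}\, q_{f(j,3)+r} \right), \end{aligned} \tag{A3} $$

$$ \begin{aligned} z^{ij,3D}_{+}(N) &= (1 - q_{f(i,3)+1})\, q_{f(j,3)+1} \prod_{s=2}^{\log_2 N} \left( 1 - q_{f(j,3)+s} - q_{f(i,3)+s} + 2\, q_{f(j,3)+s}\, q_{f(i,3)+s} \right) \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,2)+r} - q_{f(j,2)+r} + 2\, q_{f(i,2)+r}\, q_{f(j,2)+r} \right), \end{aligned} \tag{A4} $$

$$ \begin{aligned} x^{ij,3D}_{-}(N) &= (1 - q_{f(i,1)+1})\, q_{f(j,1)+1} \left[ 1 - \prod_{k=1}^{\log_2 N} (1 - q_{f(i,1)+k}) \right] \left( q_{f(j,1)+2} + q_{f(i,1)+2} - 2\, q_{f(j,1)+2}\, q_{f(i,1)+2} \right) \\ &\quad \times \prod_{r=3}^{\log_2 N} \left[ 1 - \left( q_{f(j,1)+r} + \prod_{u=2}^{r-1} q_{f(j,1)+u} - 2 \prod_{u=2}^{r} q_{f(j,1)+u} \right) - q_{f(i,1)+r} + 2\, q_{f(i,1)+r} \left( q_{f(j,1)+r} + \prod_{u=2}^{r-1} q_{f(j,1)+u} - 2 \prod_{u=2}^{r} q_{f(j,1)+u} \right) \right] \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,2)+s} - q_{f(j,2)+s} + 2\, q_{f(i,2)+s}\, q_{f(j,2)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,3)+r} - q_{f(j,3)+r} + 2\, q_{f(i,3)+r}\, q_{f(j,3)+r} \right), \end{aligned} \tag{A5} $$

$$ \begin{aligned} y^{ij,3D}_{-}(N) &= (1 - q_{f(i,2)+1})\, q_{f(j,2)+1} \left[ 1 - \prod_{k=1}^{\log_2 N} (1 - q_{f(i,2)+k}) \right] \left( q_{f(j,2)+2} + q_{f(i,2)+2} - 2\, q_{f(j,2)+2}\, q_{f(i,2)+2} \right) \\ &\quad \times \prod_{r=3}^{\log_2 N} \left[ 1 - \left( q_{f(j,2)+r} + \prod_{u=2}^{r-1} q_{f(j,2)+u} - 2 \prod_{u=2}^{r} q_{f(j,2)+u} \right) - q_{f(i,2)+r} + 2\, q_{f(i,2)+r} \left( q_{f(j,2)+r} + \prod_{u=2}^{r-1} q_{f(j,2)+u} - 2 \prod_{u=2}^{r} q_{f(j,2)+u} \right) \right] \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,3)+r} - q_{f(j,3)+r} + 2\, q_{f(i,3)+r}\, q_{f(j,3)+r} \right), \end{aligned} \tag{A6} $$

$$ \begin{aligned} z^{ij,3D}_{-}(N) &= (1 - q_{f(i,3)+1})\, q_{f(j,3)+1} \left[ 1 - \prod_{k=1}^{\log_2 N} (1 - q_{f(i,3)+k}) \right] \left( q_{f(j,3)+2} + q_{f(i,3)+2} - 2\, q_{f(j,3)+2}\, q_{f(i,3)+2} \right) \\ &\quad \times \prod_{r=3}^{\log_2 N} \left[ 1 - \left( q_{f(j,3)+r} + \prod_{u=2}^{r-1} q_{f(j,3)+u} - 2 \prod_{u=2}^{r} q_{f(j,3)+u} \right) - q_{f(i,3)+r} + 2\, q_{f(i,3)+r} \left( q_{f(j,3)+r} + \prod_{u=2}^{r-1} q_{f(j,3)+u} - 2 \prod_{u=2}^{r} q_{f(j,3)+u} \right) \right] \\ &\quad \times \prod_{s=1}^{\log_2 N} \left( 1 - q_{f(i,1)+s} - q_{f(j,1)+s} + 2\, q_{f(i,1)+s}\, q_{f(j,1)+s} \right) \prod_{r=1}^{\log_2 N} \left( 1 - q_{f(i,2)+r} - q_{f(j,2)+r} + 2\, q_{f(i,2)+r}\, q_{f(j,2)+r} \right). \end{aligned} \tag{A7} $$

TABLE IV: Truth table for the energy function $H_{toy,reduced}(\tilde{q}) = 1 + \tilde{q}_1 - \tilde{q}_2 + \tilde{q}_3 + \tilde{q}_4 - \tilde{q}_5 \tilde{q}_3 + \tilde{q}_5 \tilde{q}_6 + \delta(3\tilde{q}_5 + \tilde{q}_1 \tilde{q}_2 - 2\tilde{q}_1 \tilde{q}_5 - 2\tilde{q}_2 \tilde{q}_5) + \delta(3\tilde{q}_6 + \tilde{q}_3 \tilde{q}_4 - 2\tilde{q}_3 \tilde{q}_6 - 2\tilde{q}_4 \tilde{q}_6)$. The top of the table shows the 16 non-penalized states that satisfy $\tilde{q}_5 = \tilde{q}_1 \wedge \tilde{q}_2$ and $\tilde{q}_6 = \tilde{q}_3 \wedge \tilde{q}_4$. These 16 states map one to one to the states in Table II. A sample of the remaining 48 penalized states is shown after the break line.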